ABSTRACT



The estimation and analysis of large-scale bulk flow moments of peculiar velocity 
surveys is complicated by non-spherical survey geometry, the non-uniform sampling 
of the matter velocity field by the survey objects and the typically large measurement 
errors of the measured line-of-sight velocities. Previously, we have developed an op- 
timal 'minimum variance' (MV) weighting scheme for using peculiar velocity data to 
estimate bulk flow moments for idealized, dense and isotropic surveys with Gaussian 
radial distributions, that avoids many of these complications. These moments are de- 
signed to be easy to interpret and are comparable between surveys. In this paper, 
we test the robustness of our MV estimators using numerical simulations. Using MV 
weights, we estimate the bulk flow moments for various mock catalogues extracted 
from the LasDamas and the Horizon Run numerical simulations and compare these 
estimates to the moments calculated directly from the simulation boxes. We show that 
the MV estimators are unbiased and negligibly affected by non-linear flows. 
1 INTRODUCTION

Peculiar velocities are a sensitive probe of the underlying 
large-scale matter density fluctuations in our Universe. In 
particular, large, all-sky surveys of the peculiar velocities of 
galaxies or clusters of galaxies can provide important constraints
on cosmological parameters. However, studies of peculiar velocities
suffer from several drawbacks: (i) small-scale, non-linear flows,
such as infall into clusters, can potentially bias analyses, which
typically rely on linear theory; (ii) sparse, non-uniform sampling
of the peculiar velocity field can lead to aliasing of small-scale
power on to large scales and to bias due to heavier sampling of
dense regions; and (iii) large measurement uncertainties of
individual peculiar velocity measurements, particularly for distant
galaxies or clusters, make it necessary to work with large surveys
in order to extract meaningful constraints.

These difficulties have often been addressed by calculat- 
ing statistics from peculiar velocity surveys that are designed 
to primarily reflect large-scale flows which are well described 
by linear theory. The most common statistic used is the bulk 
flow, which represents the average motion of the objects in a
survey. The bulk flow statistic has been investigated extensively
by many groups (Dressler & Faber 1990; Kaiser 1991; Feldman &
Watkins 1994; Jaffe & Kaiser 1995; Strauss et al. 1995; Watkins &
Feldman 1995; Hudson et al. 1999, 2004; da Costa et al. 2000a;
Parnovsky & Tugay 2004; Sarkar, Feldman, & Watkins 2007;
Kashlinsky et al. 2008, 2010; Ma, Gordon, & Feldman 2011;
Macaulay et al. 2011; Nusser, Branchini, & Davis 2011; Nusser &
Davis 2011; Abate & Feldman 2012; Turnbull et al. 2012). However,
bulk flow estimates can be difficult to interpret since how they sample
the peculiar velocity field depends strongly on the character- 
istics of the particular survey being considered. In addition, 
results from bulk flow analyses have often been controversial, 
highlighting the importance of developing a robust bulk flow 
statistic that is easy to interpret and that can be compared 
between surveys with different geometries. 



In Watkins, Feldman, & Hudson (2009, hereafter Paper I) and
Feldman, Watkins, & Hudson (2010, hereafter Paper II), we
developed the 'minimum variance' (MV) moments
that were designed to estimate the bulk flow of a volume 
of a given scale rather than a particular peculiar velocity 
survey. We stress that the MV moments do not represent 
the bulk motion of the galaxies in a survey, rather they are 
estimates of the bulk motion of a given volume of space. The 
MV algorithm was designed to make a clean estimate of the 
large-scale bulk flow as a function of scale using the avail- 
able peculiar velocity data. Essentially, each velocity datum 
in a real survey is weighted in a way that minimizes the variance
of the difference between the MV-weighted bulk flow of the real
survey and an idealized survey bulk flow, on a characteristic
scale R. The MV analysis suggested bulk flow velocities well in
excess of expectations from a $\Lambda$ cold dark matter
($\Lambda$CDM) model with 7-year Wilkinson Microwave Anisotropy
Probe (WMAP7; Larson et al. 2011) central parameters.

Indeed there are a few recent observations that suggest that the
standard model may be incomplete: large-scale anomalies found in
the maps of temperature anisotropies in the cosmic microwave
background (CMB; Copi et al. 2010; Sarkar et al. 2011; Bennett
et al. 2011); a recent estimate (Lee & Komatsu 2010) finds that the
occurrence of high-velocity merging systems such as the bullet
cluster is unlikely at the ~6$\sigma$ level; a large excess of power
is seen in the statistical clustering of luminous red galaxies
(LRG) in the photometric Sloan Digital Sky Survey (SDSS) galaxy
sample (Thomas, Abdalla & Lahav 2011); Kovetz, Ben-David, & Itzhaki
(2010) find a unique direction in the CMB sky determined by
anomalous mean temperature ring profiles, also centred about the
direction of the flow detected above; the cross-correlation between
samples of galaxies and lensing of the CMB is larger than expected
(Ho et al. 2008; Hirata et al. 2008); Type Ia Supernovae (SNIa)
seem to be brighter than expected at high redshift (Kowalski et al.
2008); small voids (~10 Mpc) are observed to be much emptier than
predicted (Gottlöber et al. 2003); and observations indicate denser
high concentration cluster haloes than the shallow low
concentration and density profile predictions (de Blok 2005;
Gentile et al. 2005).

In this paper, we use N-body simulations to investigate the
robustness of our MV scheme for estimating the bulk flow moments
of the velocity field, over a volume of a particular scale, R.
First we extract a mock catalogue (described in Sec. 3) from
N-body simulations. Given this mock catalogue, we use our MV
algorithm (described in Sec. 2) to estimate the bulk flow moments
$\{u_x, u_y, u_z\}$ of the velocity field over a volume of a
particular scale. Then we position ourselves in the N-body
simulation box at the location of the centre of the mock
catalogue, and calculate the Gaussian-weighted moments
$\{V_x, V_y, V_z\}$ by averaging the velocities of all the galaxies
in the simulation box, each galaxy being weighted by a Gaussian
radial distribution function $f(r) = e^{-r^2/2R^2}$. Note that a
large number of particles in the simulation box is preferable to
accurately calculate the Gaussian moments of the velocity field.
Finally, we compare the MV-weighted moments $\{u_x, u_y, u_z\}$
with the Gaussian-weighted moments $\{V_x, V_y, V_z\}$ in Sec. 4.
A close match between the two would indicate that the MV scheme
accurately estimates the Gaussian bulk flow on scale R.
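
The Gaussian-weighted average described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not the code used for the paper: the function name
`gaussian_weighted_moments` is hypothetical, full three-dimensional
velocity vectors are assumed to be available for every simulation
galaxy, the weighted sum is normalized by the sum of the weights, and
periodic boundaries of the simulation box are ignored.

```python
import math


def gaussian_weighted_moments(positions, velocities, centre, R):
    """Gaussian-weighted mean velocity (V_x, V_y, V_z) around `centre`.

    positions, velocities: sequences of (x, y, z) tuples
    centre: (x, y, z) of the mock-catalogue centre
    R: Gaussian scale, in the same length unit as the positions
    Assumption: weights are normalized by their sum.
    """
    sum_w = 0.0
    sums = [0.0, 0.0, 0.0]
    for pos, vel in zip(positions, velocities):
        r2 = sum((p - c) ** 2 for p, c in zip(pos, centre))
        w = math.exp(-r2 / (2.0 * R * R))  # f(r) = exp(-r^2 / 2R^2)
        sum_w += w
        for i in range(3):
            sums[i] += w * vel[i]
    return tuple(s / sum_w for s in sums)
```

With this normalization a uniform velocity field is returned unchanged,
whatever the galaxy positions, which gives a simple sanity check.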

It is worth mentioning here the reason for our choice 
of a Gaussian profile f(r) over, for example, a Tophat filter 
in developing the MV formalism. A Tophat filter gets con- 
tribution from small scales. As such, bulk flow calculated 
using a Tophat filter can be compared with theoretical ex- 
pectations only if the observed velocity field is reasonably 
dense and uniform, so that the small-scale systematics av- 
erage out. However, observations typically are sparse and 
non-uniform with large uncertainties. This leads to aliasing
of small-scale power on to large scales, making comparison
with theory difficult. A Gaussian filter, on the other hand, 
gets very little contribution from small scales and isolates 
the small-scale effects present in real surveys, thereby mak- 
ing comparison with theoretical predictions meaningful. Our 
MV method is specifically designed to convert the observed 
velocity field into a Gaussian field on a user-specified scale 
R. 

In Sec. 2 we review the MV formalism. In Sec. 3 we
describe the simulations we use and surveys we model to
extract the mock catalogues. In Sec. 4 we compare the
MV-weighted bulk flow moments with the Gaussian-weighted
moments. We discuss our results and conclude in Sec. 5.



2 THE MINIMUM VARIANCE METHOD 

Individual radial peculiar velocity measurements are plagued by
large uncertainties and contributions from small-scale, non-linear
processes which are difficult to model theoretically. Both of these
problems can be greatly reduced by working with an average velocity
over a sample, commonly called the bulk flow, instead of with
individual velocities. The three components of the bulk flow $u_i$
can be written as weighted averages of the measured radial peculiar
velocities of a survey,

$$ u_i = \sum_n w_{i,n} S_n , \tag{1} $$



where $S_n$ is the radial peculiar velocity of the nth galaxy
of a survey, and $w_{i,n}$ is the weight assigned to this velocity
in the calculation of $u_i$. Throughout this paper, subscripts
i, j and k run over the three components of the bulk flow, while
subscripts m and n run over the galaxies. By far the most common
weighting scheme used in studies of the bulk flow, which we will
call the maximum likelihood estimate (MLE) method, is obtained
from a maximum likelihood analysis introduced by Kaiser (1988).
By modelling galaxy motions as being due to a uniform flow and
assuming Gaussian-distributed measurement uncertainties, the
likelihood function

$$ L[u_i | \{S_n, \sigma_n, \sigma_*\}] = \prod_n \frac{1}{\sqrt{\sigma_n^2 + \sigma_*^2}} \exp\left( -\frac{1}{2} \frac{(S_n - \hat{r}_{n,i} u_i)^2}{\sigma_n^2 + \sigma_*^2} \right) \tag{2} $$

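is obtained, where $\hat{r}_n$ is the unit position vector of the nth
galaxy, $\sigma_n$ is the measurement uncertainty of the nth galaxy
and $\sigma_*$ is a 1D velocity dispersion accounting for smaller
scale motions. Maximizing this likelihood gives a bulk flow
estimate of the form of Eq. (1) with weights

$$ w_{i,n} = \sum_{j=1}^{3} A^{-1}_{ij} \frac{\hat{r}_{n,j}}{\sigma_n^2 + \sigma_*^2} , \tag{3} $$

where

$$ A_{ij} = \sum_n \frac{\hat{r}_{n,i} \hat{r}_{n,j}}{\sigma_n^2 + \sigma_*^2} . \tag{4} $$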



These weights play the dual roles of accounting for geomet- 
rical factors, e.g. picking out the x component of velocities 
in a calculation of $u_x$, and down-weighting velocities with
large uncertainties. However, the fact that velocity uncer- 
tainties are typically proportional to distance, together with 
the sparseness of velocity catalogues at their outer edges, 
means that nearby objects are greatly emphasized in calcu- 
lations of the MLE bulk flow. Indeed, studies of the window 
functions of these moments (Paper I) have shown that MLE 
bulk flow moments of a survey are typically sensitive to flows 
on scales much smaller than the survey's physical diameter, 
thus complicating their interpretation. 
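
As an illustration of the MLE weighting scheme, the sketch below builds
the matrix A from the unit position vectors and the combined variances
and then solves for the weights of each galaxy. It is a minimal
pure-Python sketch, not the authors' implementation: the names
`solve_linear` and `mle_weights` are hypothetical, and a small
Gauss-Jordan solver stands in for a linear algebra library.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for row in range(n):
            if row != col:
                f = aug[row][col]
                aug[row] = [x - f * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


def mle_weights(unit_vectors, sigma_n, sigma_star):
    """Return MLE weights w[i][n] for the components i = x, y, z.

    unit_vectors: list of unit position vectors (x, y, z), one per galaxy
    sigma_n: list of measurement uncertainties, one per galaxy
    sigma_star: 1D velocity dispersion, same units as sigma_n
    """
    var = [s * s + sigma_star ** 2 for s in sigma_n]
    # A_ij = sum_n r_{n,i} r_{n,j} / (sigma_n^2 + sigma_*^2)
    A = [[sum(r[i] * r[j] / v for r, v in zip(unit_vectors, var))
          for j in range(3)] for i in range(3)]
    weights = [[0.0] * len(unit_vectors) for _ in range(3)]
    for n, (r, v) in enumerate(zip(unit_vectors, var)):
        # The three weights of galaxy n solve A x = r_n / var_n.
        col = solve_linear(A, [r[j] / v for j in range(3)])
        for i in range(3):
            weights[i][n] = col[i]
    return weights
```

For noise-free radial velocities generated from a pure bulk flow, the
weighted sums of the radial velocities return the input flow exactly,
which is a convenient check of the weights.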

In Paper I, we introduced an alternative to the MLE weights that
yields bulk flow moments that are much easier to interpret. First,
we imagine an idealized survey containing radial velocities that
well sample the velocity field in a region. This survey consists
of a large number of objects, all with zero measurement
uncertainty. For simplicity, the radial distribution of this
idealized survey is taken to be a Gaussian profile of the form
$f(r) \propto e^{-r^2/2R^2}$, where R gives a measure of the depth
of the survey. This idealized survey has easily interpretable bulk
flow components $U_i$ that are not affected by small-scale aliasing
and which reflect the motion of a well-defined volume. Note that
the difference between $U_i$ and $V_i$ (see Sec. 1 for the
definition of $V_i$) is that $U_i$ is calculated from an ideal
(dense and isotropic) survey, while $V_i$ is based on the galaxy
distribution obtained from N-body simulations. In the limit that
the simulations are dense enough, $V_i$ will converge towards $U_i$.

Our goal is to construct estimators for the idealized survey bulk
flow components $U_i$, out of the measured radial peculiar
velocities $S_n$ and positions $\mathbf{r}_n$ contained in a real
survey. We assume that $S_n$ can be expressed as
$S_n = v_n + \delta_n$, where $v_n$ is the radial component of the
linear peculiar velocity field at the location of the object and
$\delta_n$ accounts for the measurement noise as well as any
non-linear flow, e.g. infall into a cluster. In order to calculate
the weights to use for the bulk flow estimators, we minimize the
variance $\langle (u_i - U_i)^2 \rangle$, where the average is over
different realizations of a particular matter power spectrum.
Expanding this expression using Eq. (1) for the bulk flow estimate,
we obtain

$$ \langle (u_i - U_i)^2 \rangle = \sum_{m,n} w_{i,m} w_{i,n} \langle S_m S_n \rangle + \langle U_i^2 \rangle - 2 \sum_n w_{i,n} \langle U_i v_n \rangle , \tag{5} $$

where we have used the fact that the measurement error
included in $S_n$ is uncorrelated with the bulk flow $U_i$.

Before we minimize this expression with respect to the weights
$w_{i,n}$, we impose the following constraint introduced in
Paper II. Suppose that the velocity field were a pure bulk flow,
so that $S_n = \sum_i U_i g_i(\hat{r}_n) + \delta_n$, where $U_i$
are the three bulk moments $\{U_x, U_y, U_z\}$; $g_i(\hat{r}_n)$
are the direction cosines of the nth galaxy
$\{\hat{r}_{n,x}, \hat{r}_{n,y}, \hat{r}_{n,z}\}$ and $\delta_n$ is
the noise due to measurement error. We ask that the estimators
$u_i$ give the correct amplitude for the flow on average (over
different realizations of the universe), namely that
$\langle u_i \rangle = U_i$. Plugging the expression for $S_n$
into Eq. (1) gives the constraint that

$$ \sum_n w_{i,n} g_j(\hat{r}_n) = \delta_{ij} , \tag{6} $$



$\delta_{ij}$ being the Kronecker delta. This set of three
constraints is implemented using Lagrange multipliers, so that
we derive the desired weights by taking a derivative of the
expression

$$ \sum_{m,n} w_{i,m} w_{i,n} \langle S_m S_n \rangle + \langle U_i^2 \rangle - 2 \sum_n w_{i,n} \langle U_i v_n \rangle + \sum_{j=1}^{3} \lambda_{ij} \left( \sum_n w_{i,n} g_j(\hat{r}_n) - \delta_{ij} \right) \tag{7} $$

with respect to $w_{i,n}$ and setting the resulting expression
equal to zero. Solving for the weights then gives

$$ w_{i,n} = \sum_m G^{-1}_{mn} \left( \langle S_m U_i \rangle - \frac{1}{2} \sum_{j=1}^{3} \lambda_{ij} g_j(\hat{r}_m) \right) , \tag{8} $$

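where G is the covariance matrix of the individual measured
velocities, $G_{mn} = \langle S_m S_n \rangle$. The Lagrange
multipliers can be found by plugging Eq. (8) into Eq. (6) and
solving for $\lambda_{ij}$,

$$ \lambda_{ij} = \sum_{k=1}^{3} [M^{-1}]_{kj} \left( \sum_{m,n} G^{-1}_{mn} \langle S_m U_i \rangle g_k(\hat{r}_n) - \delta_{ik} \right) , \tag{9} $$

where the matrix M is given by

$$ M_{ij} = \frac{1}{2} \sum_{m,n} G^{-1}_{mn} g_i(\hat{r}_m) g_j(\hat{r}_n) . \tag{10} $$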
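
The constrained minimization above can be illustrated numerically. The
sketch below assumes that the covariance matrix G and the correlations
$\langle S_m U_i \rangle$ have already been computed. Rather than
forming $G^{-1}$ explicitly, it solves linear systems, and it obtains
the Lagrange multipliers directly from the constraint of Eq. (6). The
names `solve_linear` and `mv_weights` and the input layout are
hypothetical choices made for this example only.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for row in range(n):
            if row != col:
                f = aug[row][col]
                aug[row] = [x - f * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


def mv_weights(G, Q, g):
    """Constrained minimum-variance weights w[i][n].

    G: N x N symmetric covariance matrix, G[m][n] = <S_m S_n>
    Q: 3 x N correlations, Q[i][m] = <S_m U_i>
    g: N direction-cosine triples, g[n][j] = g_j(r_n)
    """
    N = len(g)
    # a[i] = G^-1 <S U_i> and b[j] = G^-1 g_j, each a length-N vector.
    a = [solve_linear(G, Q[i]) for i in range(3)]
    b = [solve_linear(G, [g[n][j] for n in range(N)]) for j in range(3)]
    # M_kj = (1/2) sum_{m,n} G^-1_{mn} g_k(r_m) g_j(r_n)
    M = [[0.5 * sum(g[n][k] * b[j][n] for n in range(N)) for j in range(3)]
         for k in range(3)]
    w = []
    for i in range(3):
        # Right-hand side of the constraint equation for the multipliers.
        T = [sum(a[i][n] * g[n][j] for n in range(N)) - (1.0 if i == j else 0.0)
             for j in range(3)]
        lam = solve_linear(M, T)  # M is symmetric
        w.append([a[i][n] - 0.5 * sum(lam[j] * b[j][n] for j in range(3))
                  for n in range(N)])
    return w
```

Whatever inputs are supplied, the returned weights should satisfy the
constraint of Eq. (6) to numerical precision, which makes a useful
consistency check before the weights are applied to data.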



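In linear theory, the correlation $\langle S_m U_i \rangle$ and the
covariance matrix G that appear in our expression for $w_{i,n}$ can
be calculated for a given matter power spectrum P(k) (for
details see Paper II):

$$ \langle S_m U_i \rangle = \sum_{n'=1}^{N'} w'_{i,n'} \langle S_m v_{n'} \rangle = \frac{H_0^2 \Omega_m^{1.1}}{2\pi^2} \sum_{n'=1}^{N'} w'_{i,n'} \int dk\, P(k) f_{mn'}(k) , \tag{11} $$

where

$$ w'_{i,n'} = \sum_{j=1}^{3} A^{-1}_{ij} \frac{\hat{r}'_{n',j}}{N'} $$

are the weights of an ideal, isotropic survey consisting of
$N'$ exact radial velocities $v_{n'}$ measured at randomly selected
positions $\mathbf{r}'_{n'}$, with

$$ A_{ij} = \sum_{n'=1}^{N'} \frac{\hat{r}'_{n',i} \hat{r}'_{n',j}}{N'} , $$

and

$$ G_{mn} = \langle \hat{r}_n \cdot \mathbf{v}(\mathbf{r}_n) \; \hat{r}_m \cdot \mathbf{v}(\mathbf{r}_m) \rangle + \delta_{mn} (\sigma_n^2 + \sigma_*^2) = \frac{H_0^2 \Omega_m^{1.1}}{2\pi^2} \int dk\, P(k) f_{mn}(k) + \delta_{mn} (\sigma_*^2 + \sigma_n^2) , \tag{12} $$

where $f_{mn}(k)$ is the angle-averaged window function:

$$ f_{mn}(k) = \int \frac{d^2\hat{k}}{4\pi} (\hat{r}_n \cdot \hat{k}) (\hat{r}_m \cdot \hat{k}) \exp\left[ i k \hat{k} \cdot (\mathbf{r}_n - \mathbf{r}_m) \right] . \tag{13} $$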



© 0000 RAS, MNRAS 000, 000-000 



4 Agarwal & Feldman & Watkins 



[Figure 1. Panel title: DEEP-survey. Axes: x (Mpc/h) and r (Mpc/h).]



Figure 1. Top row: DEEP catalogue (left) and its radial distri- 
bution (right). Bottom row: DEEP mock catalogue (left) and its 
radial distribution (right). 



Thus, given a peculiar velocity survey and a power spectrum model
P(k) we can calculate the optimum weights $w_{i,n}$ (see Eq. 8)
for estimating the MV moments $u_i$ (see Eq. 1). We use the power
spectrum model given by Eisenstein & Hu (1998) with WMAP7 (Larson
et al. 2011) central parameters. Using the optimum weights
$w_{i,n}$ from Eq. (8), the angle-averaged tensor window function
$W^2_{ij}(k)$ can be constructed (for details see Paper II) as

$$ W^2_{ij}(k) = \sum_{m,n} w_{i,m} w_{j,n} f_{mn}(k) . \tag{14} $$

The diagonal elements $W^2_{ii}$ are the window functions of
the bulk flow components $u_i$. Given a velocity survey, the
$W^2_{ij}$ estimated using the MV weights are the closest
approximation to the ideal window functions. See Paper I for the
MV-estimated window functions of the bulk flow components for
a range of surveys.



3 MOCK CATALOGUES 
3.1 N-body simulations 

To check the robustness of our MV formalism, we calculated the
bulk flow moments directly from numerical simulations. The N-body
simulations we use in our analysis are (i) the Large Suite of Dark
Matter Simulations (LasDamas; hereafter LD; McBride et al. 2009;
McBride et al. 2011, in prep.) and (ii) Horizon Run (hereafter HR;
Kim et al. 2009). These are designed to model the SDSS observations.



Figure 2. Top row: COMPOSITE catalogue (left) and its radial 
distribution (right). Bottom row: COMPOSITE mock catalogue 
(left) and its radial distribution (right). The mock does not have 
as many close by objects as there are in the COMPOSITE cata- 
logue. 



The LD (HR) simulation parameters are $\Omega_m$ = 0.25 (0.26),
$\Omega_b$ = 0.04 (0.044), $\Omega_\Lambda$ = 0.75 (0.74),
h = 0.7 (0.72), $\sigma_8$ = 0.8 (0.794), $n_s$ = 1.0 (0.96) and
$L_{\rm box}$ = 1 (6.592) $h^{-1}$ Gpc
for the matter, baryonic and cosmological constant normalized
densities, the Hubble parameter, the amplitude of matter
density fluctuations, the primordial scalar spectral index
and the simulation box size, respectively. The HR simulation
samples the density field at z = 0 and identifies galaxies using
subhalos (Kim, Park, & Choi 2008). The LD simulations
(http://lss.phy.vanderbilt.edu/lasdamas/download.html), a suite of
41 independent realizations of dark matter N-body simulations
named Carmen, have information at z = 0.13. Using the Ntropy
framework (Gardner et al. 2007), bound groups of dark matter
particles (halos) are identified with a parallel
friends-of-friends (FOF) code (Davis et al. 1985). The
cosmological parameters and the design specifications of the
LD-Carmen and HR simulations are listed in Table 1.
The LD-Carmen data we use consist of 41 independent
realizations, each in a 1 $h^{-1}$ Gpc box with the same initial
power spectrum, but a different random seed. We extract
100 mock catalogues from each of the 41 LD boxes, for a
total of 4100 mocks. The mock centres are randomly chosen
inside the box. The mocks are extracted in a way that they
come as close as possible to the radial distribution of real
catalogues. The HR simulation is a single realization in a
much bigger 6.592 $h^{-1}$ Gpc box. As such, we extract 5000
randomly distributed mocks.
Table 1. The cosmological parameters and the design specifications of the LD-Carmen and HR simulations.

                                                                    LD-Carmen   HR

Cosmological parameters
Matter density, $\Omega_m$                                          0.25        0.26
Cosmological constant density, $\Omega_\Lambda$                     0.75        0.74
Baryon density, $\Omega_b$                                          0.04        0.044
Hubble parameter, h (100 km s$^{-1}$ Mpc$^{-1}$)                    0.7         0.72
Amplitude of matter density fluctuations, $\sigma_8$                0.8         0.794
Primordial scalar spectral index, $n_s$                             1.0         0.96

Simulation design parameters
Simulation box size on a side ($h^{-1}$ Mpc)                        1000        6592
Number of CDM particles                                             $1120^3$    $4120^3$
Initial redshift, z                                                 49          23
Particle mass, $m_p$ ($10^{10}$ $h^{-1}$ $M_\odot$)                 4.938       29.6
Gravitational force softening length, $f_\epsilon$ ($h^{-1}$ kpc)   53          160







Figure 3. The left-hand panel shows the distribution of galaxies around the location of the centre of a typical mock catalogue. Each
galaxy is weighted with a Gaussian radial distribution function $f(r) = e^{-r^2/2R^2}$ (here R = 50 $h^{-1}$ Mpc). The radial distribution is shown
in the right-hand panel. The MV formalism estimates the bulk flow of this Gaussian-weighted box, by only using the mock catalogues of
the kind shown in Figs 1 and 2 (bottom rows).



3.2 Catalogues 

We create mocks of three different peculiar velocity surveys
from the simulations: (i) The 'DEEP' compilation includes
103 SNIa (Tonry et al. 2003), 70 Spiral Galaxy Clusters (SC)
Tully-Fisher (TF) clusters (Giovanelli et al. 1998; Dale et al.
1999a), 56 Streaming Motions of Abell Clusters (SMAC)
fundamental plane (FP) clusters (Hudson et al. 1999, 2004),
50 Early-type Far Galaxies (EFAR) FP clusters (Colless et al.
2001) and 15 TF clusters (Willick 1999). The DEEP catalogue
consists of 294 data points with a characteristic MLE depth of
50 $h^{-1}$ Mpc, calculated using $\sum w_n r_n / \sum w_n$ where
the MLE weights are $w_n = 1/(\sigma_n^2 + \sigma_*^2)$. In this
paper, we assume $\sigma_*$ = 150 km s$^{-1}$. We have tried
$\sigma_*$ = 150-450 km s$^{-1}$ and it does not change our results
appreciably. (ii) The SFI++ (Spiral Field I-band) catalogue
(Masters et al. 2006; Springob et al. 2007, 2009) is the densest
and most complete peculiar velocity survey of field spirals to
date. We use the data from the corrected dataset (Springob et al.
2009); the sample consists of 2821 TF field galaxies. The
characteristic depth is 34 $h^{-1}$ Mpc. (iii) The 'COMPOSITE'
catalogue is a compilation of the DEEP and SFI++ catalogues as
well as the group SFI++ catalogue (Springob et al. 2009), the
Early-type Nearby Galaxies (ENEAR; da Costa et al. 2000b; Bernardi
et al. 2002; Wegner et al. 2003) survey and a surface brightness
fluctuations (SBF) survey (Tonry et al. 2001). With 4481 data
points, the COMPOSITE catalogue has a characteristic depth of
33 $h^{-1}$ Mpc. The DEEP and SFI++ catalogues are completely
independent whereas the COMPOSITE is a compilation of these and
other catalogues. For further details on these catalogues see
Papers I and II.
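
The characteristic MLE depth quoted for each catalogue is a weighted
mean distance. A minimal sketch of that calculation follows; the
function name `mle_depth` is hypothetical, distances are taken to be in
$h^{-1}$ Mpc and uncertainties in km s$^{-1}$, and the default
$\sigma_*$ = 150 km s$^{-1}$ is the value assumed in this paper.

```python
def mle_depth(distances, sigma_n, sigma_star=150.0):
    """Characteristic depth sum(w_n r_n) / sum(w_n).

    The weights are w_n = 1 / (sigma_n^2 + sigma_*^2).
    distances: radial distances r_n of the survey objects
    sigma_n: velocity uncertainties of the survey objects
    """
    weights = [1.0 / (s * s + sigma_star ** 2) for s in sigma_n]
    return sum(w * r for w, r in zip(weights, distances)) / sum(weights)
```

Because uncertainties typically grow with distance, distant objects
receive small weights, so this depth tends to be smaller than the
unweighted mean distance of the catalogue.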

We have used these particular catalogues to investigate 
the effect of geometry and density on our results. The rea- 
son for using these catalogues is that we want to compare 
the results using a very sparse catalogue (DEEP) and the 
better sky coverage and higher density of the COMPOSITE 
catalogue. We chose the SFI++ catalogue as an intermedi- 
ate case study. We tested our MV formalism on the DEEP, 
SFI++ and COMPOSITE mocks extracted from the LD 
and HR simulations. As we mentioned earlier, we extracted 
4100 mocks from the LD simulations and 5000 from the 
HR simulation. The results based on the 5000 mock surveys 
from the HR simulation are virtually identical to the ones 
for the LD simulations. As such, in the rest of this paper, we 
display results only for the 4100 mocks extracted from the 
LD simulations. Moreover, since our results for the SFI++ 
catalogue are very similar to the ones for the DEEP and 
COMPOSITE catalogues, we do not display SFI++ results. 

In Figs 1 and 2 we show the DEEP and COMPOSITE real
catalogues (top rows) and a sample mock catalogue (bottom
rows). The N-body simulations do not have as many close by
objects as there are in the COMPOSITE catalogue, which is why
the COMPOSITE mocks match the radial distribution only beyond
~50 $h^{-1}$ Mpc.

In Fig. 3 we show the weighted distribution of galaxies
around the location of the centre of a typical mock catalogue
(left-hand panel) and its radial distribution (right-hand
panel). Each galaxy is weighted with a Gaussian radial
distribution function $f(r) = e^{-r^2/2R^2}$ with R = 50 $h^{-1}$ Mpc.
The MV formalism is designed to obtain the best estimate
of the bulk flow of this Gaussian-weighted box, by only using
the mock catalogues of the kind shown in Figs 1 and 2
(bottom rows). Note that the Gaussian-weighted box does
not have a perfect Gaussian distribution but it comes close
to being one. Denser simulations would be required to test
the MV formalism more rigorously.



[Figure 4. Panel titles: DEEP and COMPOSITE. Horizontal axes in km/s.]

Figure 4. Histograms showing the normalized probability
distribution for the MV- and Gaussian-weighted bulk flow moments
within a Gaussian window of radius R = 50 $h^{-1}$ Mpc for the
directions x and z in the top and bottom rows, respectively, for
the two types of mock catalogues in the LD simulations: DEEP
(left-hand column) and COMPOSITE (right-hand column) as in Fig. 6.
The MV-weighted bulk flow moments $u_i$ are shown as the solid
histogram. The Gaussian-weighted moments $V_i$ are shown as the
dashed histogram. We also superimpose a Gaussian centred at zero
with a width equal to the calculated rms. It is clear that both the
MV- and Gaussian-weighted moments are Gaussian distributed.
We do not show the y-direction since it is statistically identical to
the x-direction. The SFI++ catalogue shows very similar trends
and so was not displayed.



3.3 Mock extraction procedure 

Once we have identified a random point in the N-body simulation
box, we extract a set of galaxies that has the same radial
selection function about this point as the catalogue we are
creating mocks of. We do not impose the additional constraint on
the mocks that they must also have the same angular distribution
as the real surveys for two reasons: (i) the N-body simulations
are not dense enough to give us mocks that are exactly like the
real surveys and (ii) the weights $w_{i,n}$ of the real surveys
typically depend only on the radial distribution and the velocity
errors of the survey objects. Consequently, the mocks in Figs 1
and 2 have a relatively featureless angular distribution. To make
the mocks more realistic, we also impose a 10° latitude
zone-of-avoidance cut.

From the simulations we find the angular position, the true
line-of-sight peculiar velocity $v_s$ and the redshift
$cz = d_s + v_s$ for each mock galaxy, where $d_s$ is the true
radial distance of the mock galaxy from the random centre we
selected, all in km s$^{-1}$. We then perturb the true radial
distance $d_s$ of the mock galaxy with a velocity error drawn from
a Gaussian distribution of width equal to the corresponding
real galaxy's velocity error, $\sigma_n$. Thus,
$d_p = d_s + \delta d$, where $d_p$ is the perturbed radial
distance of the mock galaxy (in km s$^{-1}$) and $\delta d$ is the
velocity error. The mock galaxy's measured line-of-sight peculiar
velocity $v_p$ is then assigned to be $v_p = cz - d_p$, where cz is
the redshift we found above. The reason for this procedure is that
the weight we assign to each galaxy in the mock catalogues will
then be similar to the weights of the real catalogues, since these
depend on the radial distribution and velocity errors of the
survey objects.

This procedure of perturbing the distances $d_s$ and then
assigning the velocities $v_p$ to the mock galaxies introduces
a Malmquist bias. We have checked the effect of the bias by
following a slightly different approach to generate the mocks.
We used the exact distances $d_s$ and only perturbed the
velocities as $v_p = v_s + \delta d$. We found the effect of
Malmquist bias on our MV analyses to be negligible.
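
The perturbation step for a single mock galaxy can be sketched as
follows. This is an illustration of the procedure described above, not
the authors' code: the function name `perturb_mock` and the optional
`rng` argument are hypothetical, and all quantities are taken to be in
km s$^{-1}$.

```python
import random


def perturb_mock(d_s, v_s, sigma_n, rng=random):
    """Return the perturbed distance d_p and measured velocity v_p.

    d_s: true radial distance of the mock galaxy (km/s)
    v_s: true line-of-sight peculiar velocity (km/s)
    sigma_n: velocity error of the corresponding real galaxy (km/s)
    rng: source of Gaussian deviates (the random module by default)
    """
    cz = d_s + v_s                      # redshift, fixed by the true values
    delta_d = rng.gauss(0.0, sigma_n)   # error drawn with width sigma_n
    d_p = d_s + delta_d                 # perturbed distance
    v_p = cz - d_p                      # measured peculiar velocity
    return d_p, v_p
```

Since the redshift is held fixed, $d_p + v_p = cz$ for every mock
galaxy, so an overestimated distance always comes with an
underestimated peculiar velocity.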
Figure 5. The MV bulk flow moments $u_i$ versus the
Gaussian-weighted moments $V_i$ for R = 50 $h^{-1}$ Mpc for the two
types of mock catalogues in the LD simulations: DEEP (left-hand
column) and COMPOSITE (right-hand column). There are 4100 mocks for
each of the catalogues. We show the moments $u_x$ and $u_z$ in the
top and bottom rows, respectively. The MV- and the
Gaussian-weighted moments are plotted against each other (dots). A
perfect correlation would put all 4100 dots on the diagonal. The rms
scatter (km s$^{-1}$) in the MV moments is displayed at the top
left-hand side of each panel and shown as dashed lines. We do
not show the y-direction since it is statistically identical to the
x-direction. The SFI++ catalogue shows very similar trends and
so was not displayed.







Figure 6. Histograms showing the normalized probability
distribution for the differences between the MV- and
Gaussian-weighted moments for the x- and z-directions in the top
and bottom rows, respectively. The solid histograms show the
quantities $(u_i - V_i)$ for the 4100 mock catalogues extracted
from the 41 LD simulation boxes for R = 50 $h^{-1}$ Mpc: DEEP
(left-hand column) and COMPOSITE (right-hand column). Superimposed
on the histograms are Gaussians centred at zero and with the same
width, $\langle (u_i - V_i)^2 \rangle^{1/2}$, as the corresponding
histogram. The fact that the distributions are centred on zero
demonstrates that the MV estimators are not biased. We do not show
the y-direction since it is statistically identical to the
x-direction. The SFI++ catalogue shows very similar trends and so
was not displayed.



4 BULK FLOW MOMENTS 

For each of the 4100 LD (5000 HR) mocks, we estimated
the bulk flow moments $\{u_x, u_y, u_z\}$ using our MV weighting
scheme (Sec. 2). We then compared the results to the
Gaussian-weighted bulk moments $\{V_x, V_y, V_z\}$ calculated by
going to the same central points for each of the 4100 LD
(5000 HR) mock catalogues and averaging the velocities of
all the galaxies in the simulation box, each galaxy being
weighted by a Gaussian weight of width R = 50 $h^{-1}$ Mpc.
Although the results that we show here are for a particular
scale of R = 50 $h^{-1}$ Mpc, we have repeated our analysis for
other values of R with similar results. It is worth mentioning
here that since the position and the velocity of every galaxy
in the N-body simulations are known exactly, their respective
uncertainties are zero. Here we present our results only
from the LD simulations. The HR simulation shows very
similar results.

In Fig. 4 we show the probability distribution for the
4100 MV-weighted bulk flow moments $u_i$ (solid) and the
Gaussian-weighted moments $V_i$ (dashed) within a Gaussian
window of radius R = 50 $h^{-1}$ Mpc for the LD simulations. As
shown in Fig. 4, the distributions for the MV-estimated bulk
flow moments (solid histogram) and the Gaussian-weighted
moments (dashed histogram) are both Gaussian distributed.
This is as expected for large scale moments and reflects
the fact that non-linear motions, which can lead to
non-Gaussian tails in the velocity distributions for individual
galaxies, have been effectively averaged out. The widths of
the distributions match well with the expectations from linear
theory,

$$ \sigma_v^2(R) = \frac{H_0^2 \Omega_m^{1.1}}{2\pi^2} \int dk\, P(k) W_v^2(k,R) , \tag{15} $$

where $\sigma_v(R)$ is the RMS value of the peculiar velocity
field smoothed with a suitable filter with a characteristic
scale R; $W_v(k,R)$ is the window function (Fourier transform
of the filter) and P(k) is the matter power spectrum.



A $\Lambda$CDM model with WMAP7 (Larson et al. 2011) central
parameters, together with a Gaussian window function
$W_v(k,R) = e^{-(kR)^2/2}$, predicts a 110 km s$^{-1}$ width for
R = 50 $h^{-1}$ Mpc, virtually identical to the ones shown in
Fig. 4. In Paper II we estimated that for a $\Lambda$CDM model
with WMAP7 central parameters, the chance of getting a
~400 km s$^{-1}$ bulk flow for a survey on scales of 50 $h^{-1}$ Mpc
is ~1 per cent. Examining Fig. 4 confirms that the probability
will be similarly small. Indeed, the frequency of mock
catalogues with > 400 km s$^{-1}$ was found to be comparable
to the 1 per cent value.
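
The linear theory width for a Gaussian window can be evaluated with a
simple quadrature. The sketch below is illustrative only: the name
`sigma_v_squared`, the trapezoidal rule and the cut-off at kR = 10 are
choices made for this example, the power spectrum is any user-supplied
callable rather than the Eisenstein & Hu (1998) model, and the
normalization in front of the integral is passed in as `prefactor`
because it depends on the unit conventions adopted.

```python
import math


def sigma_v_squared(power_spectrum, R, prefactor, k_max_R=10.0, n_steps=20000):
    """prefactor * integral dk P(k) W_v^2(k, R) for a Gaussian window.

    power_spectrum: callable returning P(k) for a wavenumber k
    R: Gaussian scale, so that W_v^2 = exp(-(k R)^2)
    prefactor: normalization in front of the integral (assumed given)
    The integral is truncated at k = k_max_R / R.
    """
    k_max = k_max_R / R
    dk = k_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = i * dk
        f = power_spectrum(k) * math.exp(-(k * R) ** 2)
        # Trapezoidal rule: half weight at the two end points.
        total += f if 0 < i < n_steps else 0.5 * f
    return prefactor * total * dk
```

For a constant toy spectrum P(k) = 1 the integral has the closed form
$\sqrt{\pi}/(2R)$, which provides a check of the quadrature before a
realistic power spectrum is used.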

In Fig. 5 we show the bulk flow moments in the x-
and z-directions (in the top and bottom rows, respectively)
for the 4100 DEEP (left-hand column) and COMPOSITE
(right-hand column) mock catalogues, extracted from the
41 LD simulation boxes. The MV-weighted moments $u_i$ and
the corresponding Gaussian-weighted moments $V_i$ are plotted
against each other (dots) and the positive correlation between
the two is clearly visible. A perfect correlation would
put all 4100 points on the diagonal.

In Fig. 6 we show the probability distribution for the
difference between the MV-weighted bulk flow moments $u_i$
and the Gaussian-weighted ideal moments $V_i$ for the 4100
mock surveys from the LD simulations. A Gaussian centred
at zero and with the same width as the probability
distribution is also shown. The fact that the distributions are
centred on zero demonstrates that the MV estimators are
not biased.
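
The comparison between the two sets of moments can be summarized
numerically. The following is a minimal sketch, assuming the
MV-weighted and Gaussian-weighted moments for one component are held in
two equal-length arrays; the function name and return values are
illustrative and not the authors' pipeline.

```python
import numpy as np

def compare_moments(u, V):
    """Summarize how MV estimates u track the Gaussian-weighted moments V.

    Returns the mean difference (near zero for an unbiased estimator),
    the RMS width <(u - V)^2>^{1/2}, the standard error on the mean
    difference, and the Pearson correlation between u and V.
    """
    u = np.asarray(u, dtype=float)
    V = np.asarray(V, dtype=float)
    diff = u - V
    mean_diff = diff.mean()
    rms_width = np.sqrt(np.mean(diff ** 2))
    std_err = diff.std(ddof=1) / np.sqrt(diff.size)
    corr = np.corrcoef(u, V)[0, 1]
    return mean_diff, rms_width, std_err, corr
```

A mean difference that is small compared with its standard error is
what "centred on zero" means in practice; the correlation coefficient
quantifies how tightly the points cluster around the diagonal.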

Given a mock catalogue, the theoretical expectation
value for the width of the distribution, i.e.
$\langle (u_i - U_i)^2 \rangle^{1/2}$, can be calculated in
linear theory using Eq. 5, Eq. 11, Eq. 12 and Eq. 13. To check
the robustness of our MV method, this can then be compared
with the distribution width $\langle (u_i - V_i)^2 \rangle^{1/2}$
calculated directly from the simulations [the $(u_i - V_i)$
distribution is shown in Fig. 6] using the same cosmological
model. The theoretical widths $\langle (u_i - U_i)^2 \rangle^{1/2}$
for the 4100 LD mocks are shown in Table 2, columns 7-9. Since
each mock catalogue has in principle a slightly different
expectation for the width, we quote the average and standard
deviation of the widths obtained from the set of the mock
catalogues. The widths $\langle (u_i - V_i)^2 \rangle^{1/2}$
found in the simulations are shown in Table 2, columns 4-6.

Comparing linear theory predictions
[$\langle (u_i - U_i)^2 \rangle^{1/2}$ in Table 2, columns 7-9]
with the widths found in the numerical simulations
[$\langle (u_i - V_i)^2 \rangle^{1/2}$, columns 4-6], we see that
the distribution widths found in the simulations are somewhat
different from the widths predicted by linear theory.
This is due to the fact that the simulations are not dense
enough and thus do not have enough galaxies to emulate
an ideal survey. We explain this through Fig. 7. In the
left-hand panel, we show the weighted distribution of galaxies
around the location of the centre of a typical mock
catalogue. Each galaxy is weighted with a Gaussian radial
distribution function $f(r) = e^{-r^2/2R^2}$ with
$R = 50\,h^{-1}$ Mpc. The right-hand panel shows the window
functions $W^2_{ii}$ of the bulk flow components $U_i$ for this
distribution (dash-dotted, short-dashed and long-dashed lines
for the x, y and z components, respectively) and the ideal
window function (solid line). Non-Gaussianity in the
distribution of galaxies in the left-hand panel causes a slight
mismatch between its window functions and the ideal one. With
a larger number of galaxies in the simulations, the
Gaussian-weighted moments $V_i$ would approach the ideal
moments $U_i$, and give a closer match between
$\langle (u_i - U_i)^2 \rangle^{1/2}$ and
$\langle (u_i - V_i)^2 \rangle^{1/2}$. The DEEP
catalogue does not have as many nearby galaxies as the
SFI++ and COMPOSITE catalogues, and thus the variance
estimates calculated using linear theory
[$\langle (u_i - U_i)^2 \rangle^{1/2}$] and the LD and HR
simulations [$\langle (u_i - V_i)^2 \rangle^{1/2}$] are
significantly closer to each other. Taken together with the
lack of bias (see Fig. 6), it is clear that non-linear motions
are not having a significant effect on these large-scale moments.
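
To make the Gaussian weighting described above concrete, the sketch
below computes a Gaussian-weighted bulk flow vector from galaxy
positions and full three-dimensional velocities, as are available in a
simulation box. It assumes that the moment is the $f(r)$-weighted mean
of the velocity components; the exact normalization used for $V_i$ is
not spelled out in this section, so this form and the function name
should be read as assumptions.

```python
import numpy as np

def gaussian_weighted_moments(positions, velocities, R=50.0):
    """Weighted mean velocity with weights f(r) = exp(-r^2 / (2 R^2)).

    positions  : (N, 3) array relative to the mock centre, in h^-1 Mpc
    velocities : (N, 3) array of velocity components, in km/s
    R          : Gaussian scale in h^-1 Mpc
    """
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    r2 = np.sum(positions ** 2, axis=1)
    weights = np.exp(-r2 / (2.0 * R ** 2))
    # Normalized weighted sum over galaxies, one value per component.
    return weights @ velocities / weights.sum()
```

With few galaxies the weighted sum samples the Gaussian profile
coarsely, which is the sparseness effect the text attributes to the
simulations.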

The much improved performance of the MV formalism over
the widely used MLE scheme is also evident in Fig. 8, where
we show the window functions $W^2_{ii}$ of the bulk flow
components, calculated using MV (thick) and MLE (thin)
methods. These window functions correspond to the DEEP
(left-hand column) and COMPOSITE (right-hand column) real
catalogues, for $R = 20\,h^{-1}$ Mpc (top row) and
$R = 50\,h^{-1}$ Mpc (bottom row). For both DEEP and
COMPOSITE catalogues, the MV window functions are a reasonable
match to the ideal ones. The MLE window functions are not
only contaminated by small-scale power, they are also very
different for the x-, y- and z-directions, making it difficult
to interpret the MLE bulk flow moments. On the other hand,
by directly controlling the survey window functions the MV
formalism effectively suppresses the small-scale contribution
to the bulk flow. Since it is the small scales that are
predominantly plagued by non-linear effects, the MV scheme is
able to make a clean estimate (compared to MLE) of the bulk
flow components, while keeping the non-linear contamination
to a minimum.
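
For reference, the MLE bulk flow estimate contrasted with MV above can
be sketched as follows. This uses the inverse-variance form of the MLE
weights commonly written in the bulk flow literature, which is assumed
(not confirmed by this section) to correspond to the paper's Eq. 3. The
function name and the default value of the velocity noise term
`sigma_star` are illustrative choices.

```python
import numpy as np

def mle_bulk_flow(positions, v_los, sigma_n, sigma_star=150.0):
    """Maximum-likelihood bulk flow from line-of-sight velocities.

    positions  : (N, 3) galaxy positions relative to the observer
    v_los      : (N,) measured line-of-sight peculiar velocities, km/s
    sigma_n    : (N,) measurement uncertainties, km/s
    sigma_star : extra velocity noise added in quadrature (illustrative)
    Returns the bulk flow vector (3,) and the weights (3, N).
    """
    positions = np.asarray(positions, dtype=float)
    rhat = positions / np.linalg.norm(positions, axis=1, keepdims=True)
    inv_var = 1.0 / (np.asarray(sigma_n, dtype=float) ** 2 + sigma_star ** 2)
    weighted_rhat = (rhat * inv_var[:, None]).T   # shape (3, N)
    A = weighted_rhat @ rhat                      # 3x3 geometry matrix
    weights = np.linalg.solve(A, weighted_rhat)   # A^{-1} applied per galaxy
    return weights @ np.asarray(v_los, dtype=float), weights
```

These weights depend only on the survey geometry and errors, so nothing
in them controls the window function; that is the freedom the MV scheme
exploits.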

In Table 2, columns 1-3, we also show the values of
the theoretical widths $\langle (u_i - U_i)^2 \rangle^{1/2}$
from the real catalogues on which the mocks are based. We see
that the theoretical widths for the real catalogues (columns
1-3) are somewhat larger than the theoretical widths for the
mocks (columns 7-9). This is due primarily to the fact that
the objects in the simulated catalogues are less clumped than
in the real catalogues, even though they have similar radial
distribution functions. This is evident in Figs 1 and 2, where
the mock catalogues can be seen as having a relatively
featureless spatial distribution. Less clumping and fewer
nearby galaxies in the simulations lower the MV-weighted bulk
flow moments $u_i$, resulting in somewhat lower widths
$\langle (u_i - U_i)^2 \rangle^{1/2}$ than the real catalogue
widths. The creation of mock catalogues with widths that more
closely matched the real catalogue widths would require
simulations with higher resolution.

We also found that the sparser the mock catalogue is
(e.g. DEEP), the higher the chances of getting large velocities
(see the extended tails in the velocity distributions for the
DEEP mocks in Fig. 4), but in a way that is consistent with
the larger uncertainties associated with the estimators
derived from these mock catalogues. This can be seen by
comparing the predicted distribution widths
$\langle (u_i - U_i)^2 \rangle^{1/2}$ for the DEEP and
COMPOSITE mock catalogues in Table 2, columns 7-9. The DEEP
mocks, being sparser than the COMPOSITE mocks, have larger
widths. Comparing the widths of the $(u_i - V_i)$ histograms
(Table 2, columns 4-6) found in the simulations (Fig. 6), we
again see that the DEEP mocks have marginally larger
uncertainties in the bulk flow estimators, as expected.
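
The frequency of large flows in a set of mocks can be counted directly
and compared with a simple analytic expectation. The sketch below is
illustrative only: it assumes the three components are independent and
Gaussian with a common width, so that the flow magnitude follows a
Maxwell distribution, and it ignores any differences between
components. Both function names are hypothetical.

```python
import math
import numpy as np

def exceedance_fraction(moments, threshold=400.0):
    """Fraction of mocks whose bulk flow magnitude exceeds `threshold`.

    moments : (N, 3) array of bulk flow components per mock, in km/s
    """
    speed = np.linalg.norm(np.asarray(moments, dtype=float), axis=1)
    return float(np.mean(speed > threshold))

def maxwell_tail(threshold, sigma):
    """P(|u| > threshold) for three Gaussian components of width sigma."""
    x = threshold / sigma
    return (math.erfc(x / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * x * math.exp(-0.5 * x * x))
```

Because the tail probability rises steeply with `sigma`, a sparser
catalogue with a broader distribution naturally yields more mocks above
a fixed threshold.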
Table 2. The theoretical distribution width $\langle (u_i - U_i)^2 \rangle^{1/2}$ for the real catalogues in the first (x), second (y) and third (z) columns,
calculated in linear theory using Eq. 5, Eq. 11, Eq. 12 and Eq. 13. In the fourth (x), fifth (y) and sixth (z) columns, we show the
widths $\langle (u_i - V_i)^2 \rangle^{1/2}$ of the $(u_i - V_i)$ histograms for the LD mocks (see Fig. 6); this should be compared to the first three columns. The
theoretical widths for the LD mocks are shown in the seventh (x), eighth (y) and ninth (z) columns. For the LD mocks, we quote the
mean and standard deviation values of $\langle (u_i - U_i)^2 \rangle^{1/2}$ for the 4100 mock catalogues. These values are based on WMAP7 (Larson et al.
2011) central power spectrum parameters. In the last column we show the width of the distribution of the moments over the 4100
mock catalogues (see Fig. 4). Since the widths for $u_x$, $u_y$ and $u_z$ were all found to be very similar, we only quote a single value for $u_i$ in the
last column. All values are in km s$^{-1}$.
Figure 7. Left: the distribution of galaxies around the location of the centre of a typical mock catalogue. Each galaxy is weighted with
a Gaussian radial distribution function $f(r) = e^{-r^2/2R^2}$ (here $R = 50\,h^{-1}$ Mpc). Right: the window functions $W^2_{ii}$ (see Eq. 14) of the
bulk flow components $U_i$, for $R = 50\,h^{-1}$ Mpc. The x, y and z components are dash-dotted, short-dashed and long-dashed lines, respectively,
and correspond to the distribution in the left-hand panel. The solid line is the ideal window function (since the ideal survey is isotropic,
all components are the same).



5 DISCUSSION AND CONCLUSIONS 

In previous papers (Papers I and II), we developed a weight- 
ing scheme for analyzing peculiar velocity surveys that gives 
estimators of idealized bulk flow moments that reflect the 
flow of a volume of a particular scale centred on our loca- 
tion rather than the characteristics of a particular survey. 
Given a peculiar velocity survey, the MV method is capa- 
ble of 'redesigning' the survey window function in a way 
that minimizes the aliasing of small-scale power on to large 
scales, thereby making comparisons with linear theory as 
well as among independent surveys possible. The direct con- 
trol over a survey window function makes the MV formalism 
an extremely useful tool when comparing bulk flow results 
across independent surveys with varying characteristics. 



Using mock catalogues drawn from numerical simulations,
we have demonstrated that the MV formalism, within
errors, recovers the bulk flow moments of the underlying
matter distribution and that the MV moments are unbiased
estimators of the bulk flow of a volume of a given scale,
regardless of the geometry of a particular survey. The MV
moments are unbiased, in that on average they give the correct
values for the idealized bulk flow components. We calculated
the variance of the bulk flow estimator using (i) linear theory,
$\langle (u_i - U_i)^2 \rangle^{1/2}$, and (ii) numerical
simulations, $\langle (u_i - V_i)^2 \rangle^{1/2}$.
Although the variances calculated using the simulations were
found to be somewhat different from the linear theory
predictions, we argued that this is due to the simulations being
underdense and thus not having enough galaxies. For numerical
simulations with higher resolution (more galaxies), we
Figure 8. The window functions $W^2_{ii}$ of the bulk flow components
calculated using MV (thick) weights (see Eq. 5) and MLE (thin)
weights (see Eq. 3) for $R = 20\,h^{-1}$ Mpc (top row) and
$R = 50\,h^{-1}$ Mpc (bottom row) for the DEEP (left-hand column) and
COMPOSITE (right-hand column) real catalogues. The x, y and
z components are dash-dotted, short-dashed and long-dashed lines,
respectively. The solid line is the ideal window function.

expect the Gaussian-weighted moments $V_i$ to approach the
ideal moments $U_i$ and give a much closer match. We found
the variance estimates using simulations and linear theory
to be significantly closer to each other for the DEEP
catalogue, which has fewer nearby galaxies and thus performed
much better than the SFI++ and COMPOSITE catalogues
when testing the MV formalism. These results validate our
use of linear theory in the development of the MV method
and confirm that non-linear, small-scale motions
do not significantly affect the MV estimators.

We tested many facets of the MV formalism and found
agreement in all the tests we performed using the LD and
HR simulations. We found that the chance of getting large
flows ($\sim 400$ km s$^{-1}$) in a $\Lambda$CDM universe is of
order 1 per cent. The bulk flow moments $u_i$ estimated using
our MV formalism are, within errors, the same as the moments
$V_i$ of the volume as traced by all the galaxies in the
simulation box, and linear theory correctly predicts the variance
of the estimators. Further, since the formalism allows for
exploration of all scales where there are data, we can
reliably explore flows on many scales and track the dynamics
of volumes of different scales (parametrized by the radius $R$
of a Gaussian sphere).